Demonstrate roundtrip export/import works #2940
In the future, we probably should skip copyFields...
I'd love a plus one on this before I merge... For a lot of multi-step processes, like exporting and importing data, I find that modeling them as BATS tests makes them easier to understand in context. I know that moves these BATS tests towards being integration or even system-style tests... I suspect that we may need to split the BATS tests into system/integration/unit tests in the future.
Cool. Love the how-to nature of a BATS test here.
solr/solr-ref-guide/modules/deployment-guide/pages/solr-control-script-reference.adoc
…l-script-reference.adoc Co-authored-by: David Smiley <[email protected]>
I have a dream someday to re-organize all the Solr documentation along the lines of Diátaxis. I've used this model a few times and quite like it.
@@ -165,6 +165,20 @@ Once the alias is in place and you are satisfied you no longer need the old data
One advantage of this option is that you can switch back to the old collection if you discover problems our testing did not uncover.
Of course this option can require more resources until the old collection can be deleted.

=== Exporting/Importing Data from Solr

Sometimes you don't want to run your full ETL pipeline to reindex into another collection, you just want to take the data in your existing collection, export it, and then import it back.
[-0] AFAIK `bin/solr export` uses the `/export` handler and can only return fields that have docValues enabled.
That's a huge limitation that we should probably mention here! Imagine someone's confusion when they follow these docs and somehow lose all of the text fields they were searching on!

There are a number of third party tools that do this, see https://solr.cool/ for more information. However, if you want to use what ships with Solr then we have some options:

1. Use `bin/solr export` with the JSON output format (`.json`), and the `bin/solr post` tool to post that data back.
[Q] How would a user choose between these three options? Or put differently - why would they choose one over the others?
If there's no strong differentiator between the three, is there value in mentioning them all individually?
https://issues.apache.org/jira/browse/SOLR-13689
Description
Trying to understand best ways of round tripping data. Export from one collection and index into another collection. Use our existing tooling as much as possible.
Solution
Starting with a BATS test to demonstrate that bin/solr export and bin/solr post with .json file works.
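For readers who want to see the shape of that round trip outside of BATS, here is a minimal shell sketch. It is not the test from this PR. The flag names (`-url`, `-format`, `-out`) are assumptions based on the 9.x-era CLI and have changed between Solr releases, so check `bin/solr export --help` and `bin/solr post --help` for your version. `SOLR_BIN`, `SOLR_BASE`, and the `roundtrip` helper are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the flag names below are assumptions (9.x-era CLI).
# Verify them with `bin/solr export --help` and `bin/solr post --help`.

SOLR_BIN="${SOLR_BIN:-bin/solr}"                      # path to the Solr control script
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"  # hypothetical local node

# roundtrip SRC DEST OUT
# Export the documents of collection SRC to the file OUT in the JSON
# format, then post that file to the update handler of collection DEST.
# The post step runs only if the export step succeeds.
roundtrip() {
  local src="$1" dest="$2" out="$3"
  "$SOLR_BIN" export -url "$SOLR_BASE/$src" -format json -out "$out" &&
    "$SOLR_BIN" post -url "$SOLR_BASE/$dest/update" "$out"
}
```

A call such as `roundtrip techproducts techproducts_copy /tmp/techproducts.json` would then cover both steps, assuming both collections already exist (the collection names here are placeholders). As raised in the review above, the export is expected to contain only fields with docValues enabled, so the copy may be missing fields that the original collection had.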
Tests
BATS.